73 research outputs found
Understanding Anatomy Classification Through Attentive Response Maps
One of the main challenges for broad adoption of deep learning based models
such as convolutional neural networks (CNN), is the lack of understanding of
their decisions. In many applications, a simpler, less capable model that can
be easily understood is preferable to a black-box model that has superior
performance. In this paper, we present an approach for designing CNNs based on
visualization of the internal activations of the model. We visualize the
model's response through attentive response maps obtained using a fractional
stride convolution technique and compare the results with known imaging
landmarks from the medical literature. We show that sufficiently deep and
capable models can be successfully trained to use the same medical landmarks a
human expert would use. Our approach communicates the model's decision process
well and also offers insight into detecting biases.
Comment: Accepted at ISBI, 201
Diffusion Variational Autoencoders
A standard Variational Autoencoder, with a Euclidean latent space, is
structurally incapable of capturing topological properties of certain datasets.
To remove topological obstructions, we introduce Diffusion Variational
Autoencoders with arbitrary manifolds as a latent space. A Diffusion
Variational Autoencoder uses transition kernels of Brownian motion on the
manifold. In particular, it uses properties of the Brownian motion to implement
the reparametrization trick and fast approximations to the KL divergence. We
show that the Diffusion Variational Autoencoder is capable of capturing
topological properties of synthetic datasets. Additionally, we train MNIST on
spheres, tori, projective spaces, SO(3), and a torus embedded in R3. Although a
natural dataset like MNIST does not have latent variables with a clear-cut
topological structure, training it on a manifold can still highlight
topological and geometrical properties.
Comment: 10 pages, 8 figures. Added an appendix with derivation of asymptotic
expansion of KL divergence for heat kernel on arbitrary Riemannian manifolds,
and an appendix with new experiments on binarized MNIST. Added a previously
missing factor in the asymptotic expansion of the heat kernel and corrected a
coefficient in the asymptotic expansion of the KL divergence; further minor edit
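The reparametrization idea in the abstract above can be illustrated with a toy random walk on the unit sphere. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes Brownian motion is approximated by small Gaussian steps in the ambient space, each followed by a projection back onto the sphere. The names `normalize` and `random_walk_on_sphere` are hypothetical.

```python
import math
import random

def normalize(v):
    """Project a nonzero vector onto the unit sphere."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def random_walk_on_sphere(z, t, steps, rng):
    """Approximate Brownian motion on the unit sphere for total time t.

    Assumption: each of the `steps` increments is an ambient Gaussian step
    with per-coordinate variance t / steps, followed by a projection.
    """
    scale = math.sqrt(t / steps)
    for _ in range(steps):
        z = normalize([c + scale * rng.gauss(0.0, 1.0) for c in z])
    return z

# The noise is drawn independently of the starting point and of t, so the
# sample is a deterministic function of (z, t, noise). That is the property
# a reparametrization trick relies on.
sample = random_walk_on_sphere([0.0, 0.0, 1.0], t=0.1, steps=10, rng=random.Random(0))
```

The sketch covers only the sphere; other manifolds named in the abstract would need their own projection step.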
Anomaly Detection for imbalanced datasets with Deep Generative Models
Many important data analysis applications present with severely imbalanced
datasets with respect to the target variable. A typical example is medical
image analysis, where positive samples are scarce, while performance is
commonly estimated against the correct detection of these positive examples. We
approach this challenge by formulating the problem as anomaly detection with
generative models. We train a generative model without supervision on the
'negative' (common) datapoints and use this model to estimate the likelihood of
unseen data. A successful model allows us to detect the 'positive' case as low
likelihood datapoints.
In this position paper, we present the use of state-of-the-art deep
generative models (GAN and VAE) for the estimation of a likelihood of the data.
Our results show that on the one hand both GANs and VAEs are able to separate
the 'positive' and 'negative' samples in the MNIST case. On the other hand, for
the NLST case, neither GANs nor VAEs were able to capture the complexity of the
data and discriminate anomalies at the level that this task requires. These
results show that even though there are a number of successes presented in the
literature for using generative models in similar applications, there remain
further challenges for broad successful implementation.
Comment: 15 pages, 13 figures, accepted by Benelearn 2018 conferenc
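The likelihood-based detection scheme described in this abstract can be sketched with a deliberately simple density model. The example below is an illustration only: a 1-D Gaussian stands in for the GAN or VAE, and the 1st-percentile threshold is an assumed choice, not one taken from the paper. All function names are hypothetical.

```python
import math
import random

def fit_gaussian(samples):
    """Fit a 1-D Gaussian (mean, std) to the 'negative' samples only."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, math.sqrt(var)

def log_likelihood(x, mean, std):
    """Log-density of x under the fitted Gaussian."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def is_anomaly(x, mean, std, threshold):
    """Flag x as 'positive' (anomalous) when its log-likelihood is low."""
    return log_likelihood(x, mean, std) < threshold

# Train without labels on the common ('negative') class only.
rng = random.Random(0)
negatives = [rng.gauss(0.0, 1.0) for _ in range(1000)]
mean, std = fit_gaussian(negatives)

# Assumed threshold: roughly the 1st percentile of training log-likelihoods.
train_ll = sorted(log_likelihood(x, mean, std) for x in negatives)
threshold = train_ll[len(train_ll) // 100]
```

A point far from the training data, such as `8.0`, falls below the threshold, while a point at the fitted mean does not.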
BSDAR: Beam Search Decoding with Attention Reward in Neural Keyphrase Generation
This study mainly investigates two decoding problems in neural keyphrase
generation: sequence length bias and beam diversity. We introduce an extension
of beam search inference based on word-level and n-gram level attention score
to adjust and constrain Seq2Seq prediction at test time. Results show that our
proposed solution can overcome the algorithm's bias toward shorter and nearly
identical sequences, resulting in a significant improvement of the decoding
performance on generating keyphrases that are present and absent in source
text.
Equivariant Neural Simulators for Stochastic Spatiotemporal Dynamics
Neural networks are emerging as a tool for scalable data-driven simulation of
high-dimensional dynamical systems, especially in settings where numerical
methods are infeasible or computationally expensive. Notably, it has been shown
that incorporating domain symmetries in deterministic neural simulators can
substantially improve their accuracy, sample efficiency, and parameter
efficiency. However, to incorporate symmetries in probabilistic neural
simulators that can simulate stochastic phenomena, we need a model that
produces equivariant distributions over trajectories, rather than equivariant
function approximations. In this paper, we propose Equivariant Probabilistic
Neural Simulation (EPNS), a framework for autoregressive probabilistic modeling
of equivariant distributions over system evolutions. We use EPNS to design
models for a stochastic n-body system and stochastic cellular dynamics. Our
results show that EPNS considerably outperforms existing neural network-based
methods for probabilistic simulation. More specifically, we demonstrate that
incorporating equivariance in EPNS improves simulation quality, data
efficiency, rollout stability, and uncertainty quantification. We conclude that
EPNS is a promising method for efficient and effective data-driven
probabilistic simulation in a diverse range of domains.
Comment: Accepted to NeurIPS 202
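The symmetry property this abstract builds on, f(g·x) = g·f(x), can be checked numerically on a toy system. The sketch below covers only the deterministic case that the abstract contrasts with; EPNS itself concerns equivariant distributions over trajectories, which this example does not model. The update rule and all names are hypothetical.

```python
def rotate90(p):
    """Rotate a 2-D point by 90 degrees counter-clockwise about the origin."""
    x, y = p
    return (-y, x)

def step(points, rate=0.5):
    """Toy deterministic update: move every particle part-way to the centroid."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return [(x + rate * (cx - x), y + rate * (cy - y)) for x, y in points]

def is_equivariant(points, tol=1e-12):
    """Check that step(g . x) equals g . step(x) for g = rotate90."""
    lhs = step([rotate90(p) for p in points])
    rhs = [rotate90(p) for p in step(points)]
    return all(abs(a - c) <= tol and abs(b - d) <= tol
               for (a, b), (c, d) in zip(lhs, rhs))
```

For a probabilistic simulator, the analogous requirement would be that rotating the input rotates the whole output distribution, not just a single prediction.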
Evolutionary Construction of Convolutional Neural Networks
Neuro-Evolution is a field of study that has recently gained significantly
increased traction in the deep learning community. It combines deep neural
networks and evolutionary algorithms to improve and/or automate the
construction of neural networks. Recent Neuro-Evolution approaches have shown
promising results, rivaling hand-crafted neural networks in terms of accuracy.
A two-step approach is introduced where a convolutional autoencoder is created
that efficiently compresses the input data in the first step, and a
convolutional neural network is created to classify the compressed data in the
second step. The creation of networks in both steps is guided by an
evolutionary process, where new networks are constantly being generated by
mutating members of a collection of existing networks. Additionally, a method
is introduced that considers the trade-off between compression and information
loss of different convolutional autoencoders. This is used to select the
optimal convolutional autoencoder from among those evolved to compress the data
for the second step. The complete framework is implemented, tested on the
popular CIFAR-10 data set, and the results are discussed. Finally, a number of
possible directions for future work with this particular framework in mind are
considered, including opportunities to improve its efficiency and its
application in particular areas.
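The evolutionary process described in this abstract, where new networks are generated by mutating members of an existing collection, can be sketched as a small loop. This is an illustration under assumptions: a genome is a list of layer widths, and `fitness` is a hypothetical stand-in for the validation accuracy that a real system would obtain by training each network.

```python
import random

def mutate(genome, rng):
    """Return a copy of the genome with one layer width perturbed (kept >= 1)."""
    child = list(genome)
    i = rng.randrange(len(child))
    child[i] = max(1, child[i] + rng.choice([-8, 8]))
    return child

def fitness(genome):
    """Hypothetical score that peaks when every layer width equals 64."""
    return -sum(abs(w - 64) for w in genome)

def evolve(population, generations, rng):
    """Repeatedly mutate a random member, then discard the worst individual."""
    population = [list(g) for g in population]
    for _ in range(generations):
        child = mutate(rng.choice(population), rng)
        population.append(child)
        population.remove(min(population, key=fitness))
    return max(population, key=fitness)

best = evolve([[16, 16, 16]] * 4, generations=500, rng=random.Random(0))
```

Because only the worst individual is dropped each generation, the best fitness in the population never decreases.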
Calibrated Adversarial Training
Adversarial training is an approach to increasing the robustness of models to adversarial attacks by including adversarial examples in the training set. One major challenge in producing adversarial examples is to include enough perturbation to flip the model's output without severely changing the example's semantic content. Excessive change in the semantic content could also change the true label of the example, and adding such examples to the training set has adverse effects. In this paper, we present Calibrated Adversarial Training, a method that reduces the adverse effects of semantic perturbations in adversarial training. The method produces pixel-level adaptations to the perturbations based on a novel calibrated robust error. We provide a theoretical analysis of the calibrated robust error and derive an upper bound for it. Our empirical results show superior performance of Calibrated Adversarial Training on a number of public datasets.
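The notion of an adversarial example used in this abstract can be made concrete with the standard one-step fast gradient sign method (FGSM) on a logistic-regression model. This sketch shows plain FGSM only, not the calibrated method the paper proposes; the weights, input, and `eps` value are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(w, b, x, y, eps):
    """One-step FGSM: x' = x + eps * sign(dL/dx).

    For the cross-entropy loss of this model, dL/dx = (p - y) * w.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input with true label y = 1.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1
x_adv = fgsm(w, b, x, y, eps=0.5)
```

In adversarial training, pairs such as `(x_adv, y)` are added to the training set. The abstract's concern is that a large `eps` may change the true label, so that keeping `y` becomes wrong.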